1 XML and semistructured data querying: XPath lookup queries in P2P networks
Angela Bonifati, Ugo Matrangolo, Alfredo Cuzzocrea, Mayank Jain 

November 2004 Proceedings of the 6th annual ACM international workshop on Web 
information and data management 

Full text available: pdf(263.77 KB) Additional Information: full citation, abstract, references, index terms

We address the problem of querying XML data over a P2P network. In P2P networks, the 
allowed kinds of queries are usually exact-match queries over file names. We discuss the 
extensions needed to deal with XML data and XPath queries. A single peer can hold a whole 
document or a partial/complete fragment of the latter. Each XML fragment/document is 
identified by a distinct path expression, which is encoded in a distributed hash table. Our 
framework differs from content-based routing mechanisms, ... 
2 Multimedia: Peer-to-peer architecture for content-based music retrieval on acoustic
data 

Cheng Yang 

May 2003 Proceedings of the 12th international conference on World Wide Web 

Full text available: pdf(146.73 KB) Additional Information: full citation, abstract, references, index terms

In traditional peer-to-peer search networks, operations focus on properly labeled files such 
as music or video, and the actual search is often limited to text tags. The explosive growth 
of available multimedia documents in recent years calls for more flexible search capabilities, 
namely search by content. Most content-based search algorithms are computationally 
intensive, making them inappropriate for a peer-to-peer environment. In this paper, we 
discuss a content-based music retrieval algorithm ... 

Keywords: acoustic data, content-based music retrieval, distributed, load balancing,
peer-to-peer, resource pooling



3 Peer-to-peer infrastructure: Pastiche: making backup cheap and easy 
Landon P. Cox, Christopher D. Murray, Brian D. Noble 

December 2002 ACM SIGOPS Operating Systems Review, Volume 36 issue SI
Backup is cumbersome and expensive. Individual users almost never back up their data,
and backup is a significant cost in large organizations. This paper presents Pastiche, a 
simple and inexpensive backup system. Pastiche exploits excess disk capacity to perform 
peer-to-peer backup with no administrative costs. Each node minimizes storage overhead 
by selecting peers that share a significant amount of data. It is easy for common 
installations to find suitable peers, and peers with high ove ... 

4 Communication privacy: How to achieve blocking resistance for existing systems 
enabling anonymous web surfing 
Stefan Köpsell, Ulf Hillig

October 2004 Proceedings of the 2004 ACM workshop on Privacy in the electronic 
society 

Full text available: pdf(897.66 KB) Additional Information: full citation, abstract, references, index terms

We are developing a blocking resistant, practical and usable system for anonymous web 
surfing. This means, the system tries to provide as much reachability and availability as 
possible, even to users in countries where the free flow of information is legally, 
organizationally and physically restricted. The proposed solution is an add-on to existing 
anonymity systems. First we give a classification of blocking criteria and some general 
countermeasures. Using these techniques, we outline a cone ... 
5 Reviewed articles: Measuring the evolution of transport protocols in the internet
Alberto Medina, Mark Allman, Sally Floyd 

April 2005 ACM SIGCOMM Computer Communication Review, Volume 35 issue 2

Full text available: pdf(1.48 MB) Additional Information: full citation, abstract, references, index terms

In this paper we explore the evolution of both the Internet's most heavily used transport 
protocol, TCP, and the current network environment with respect to how the network's 
evolution ultimately impacts end-to-end protocols. The traditional end-to-end assumptions 
about the Internet are increasingly challenged by the introduction of intermediary network 
elements (middleboxes) that intentionally or unintentionally prevent or alter the behavior of 
end-to-end communications. This paper provides mea ... 

Keywords: Internet, TCP, evolution, middleboxes 



6 Handling Heterogeneity in Shared-Disk File Systems
Changxun Wu, Randal Burns 

November 2003 Proceedings of the 2003 ACM/IEEE conference on Supercomputing

Full text available: pdf(268.40 KB) Additional Information: full citation, abstract

We develop and evaluate a system for load management in shared-disk file systems built 
on clusters of heterogeneous computers. The system generalizes load balancing and server 
provisioning. It balances file metadata workload by moving file sets among cluster server 
nodes. It also responds to changing server resources that arise from failure and recovery 
and dynamically adding or removing servers. The system is adaptive and self-managing. It 
operates without any a-priori knowledge of workload pro ... 



7 The case for TCP/IP puzzles
Wu-chang Feng 

August 2003 ACM SIGCOMM Computer Communication Review, Proceedings of the
ACM SIGCOMM workshop on Future directions in network architecture,
Volume 33 Issue 4



http://portal.acm.org/results.cfm?coll=ACM&dl=ACM&CFID=4 5/29/05 



Results (page 1): peer-to-peer network fingerprinting algorithm 






Full text available: pdf(256.51 KB) Additional Information: full citation, abstract, references, citings

Since the Morris worm was unleashed in 1988, distributed denial-of-service (DDoS) attacks 
via worms and viruses have continued to periodically disrupt the Internet. Client puzzles 
have been proposed as one mechanism for protecting protocols against denial of service 
attacks. In this paper, we argue that such puzzles must be placed within the slim waistline 
of the TCP/IP protocol stack in order to truly provide protection. We then describe several 
scenarios in which TCP/IP puzzles could be ... 



8 DISP: Practical, efficient, secure and fault-tolerant distributed data storage
Daniel Ellard, James Megquier 

February 2005 ACM Transactions on Storage (TOS), Volume 1 issue 1

Full text available: pdf(148.11 KB) Additional Information: full citation, abstract, references, index terms

DISP is a practical client-server protocol for the distributed storage of immutable data 
objects. Unlike most other contemporary protocols, DISP permits applications to make 
explicit tradeoffs between total storage space, computational overhead, and guarantees of 
availability, integrity, and privacy on a per-object basis. Applications specify the degree of 
redundancy with which each item is encoded, what level of integrity checks are computed 
and stored with each item, and whether items are stor ... 



Keywords: Distributed data storage 



9 Turning the postal system into a generic digital communication mechanism 

Randolph Y. Wang, Sumeet Sobti, Nitin Garg, Elisha Ziskind, Junwen Lai, Arvind Krishnamurthy 
August 2004 ACM SIGCOMM Computer Communication Review, Proceedings of the
2004 conference on Applications, technologies, architectures, and
protocols for computer communications, Volume 34 issue 4
Full text available: pdf(331.29 KB) Additional Information: full citation, abstract, references, index terms

The phenomenon that rural residents and people with low incomes lag behind in Internet 
access is known as the "digital divide." This problem is particularly acute in developing 
countries, where most of the world's population lives. Bridging this digital divide, especially 
by attempting to increase the accessibility of broadband connectivity, can be challenging. 
The improvement of wide-area connectivity is constrained by factors such as how quickly 
we can dig ditches to bury fibers in the ground; ... 

Keywords: network architecture, postal network, storage devices 



10 Traffic characterization and SPAM: Measuring interactions between transport protocols
and middleboxes

Alberto Medina, Mark Allman, Sally Floyd 

October 2004 Proceedings of the 4th ACM SIGCOMM conference on Internet 
measurement 

Full text available: pdf(102.74 KB) Additional Information: full citation, abstract, references, index terms

In this paper we explore the evolution of both the Internet's most heavily used transport 
protocol, TCP, and the current network environment with respect to how the network's 
evolution ultimately impacts end-to-end protocols. The traditional end-to-end assumptions 
about the Internet are increasingly challenged by the introduction of intermediary network 
elements (middleboxes) that intentionally or unintentionally prevent or alter the behavior of 
end-to-end communications. This paper provides ... 

Keywords: TCP, evolution, internet, middleboxes 
11 Synthesizing Realistic Computational Grids
Dong Lu, Peter A. Dinda 

November 2003 Proceedings of the 2003 ACM/IEEE conference on Supercomputing 

Full text available: pdf(224.44 KB) Additional Information: full citation, abstract

Realistic workloads are essential in evaluating middleware for computational grids. One 
important component is the raw grid itself: a network topology graph annotated with the 
hardware and software available on each node and link. This paper defines our 
requirements for grid generation and presents GridG, our extensible generator. We describe 
GridG in two steps: topology generation and annotation. For topology generation, we have 
both model and mechanism. We extend Tiers, an existing tool from t ... 

12 Tunable randomization for load management in shared-disk clusters 
Changxun Wu, Randal Burns 

February 2005 ACM Transactions on Storage (TOS), volume 1 issue 1
Full text available: pdf(551.85 KB) Additional Information: full citation, abstract, references, index terms

We develop and evaluate a system for load management in shared-disk file systems built 
on clusters of heterogeneous computers. It balances workload by moving file sets among 
cluster server nodes. It responds to changing server resources that arise from failure and 
recovery, and dynamically adding or removing servers. It also realizes performance 
consistency— nearly uniform performance across all servers. The system is adaptive and 
self-tuning. It operates without any a priori knowledge o ... 

Keywords: Load management, computer clusters, heterogeneity, shared-disk file systems
13 Intrusion detection: Enhancing byte-level network intrusion detection signatures with
context 

Robin Sommer, Vern Paxson 

October 2003 Proceedings of the 10th ACM conference on Computer and 
communications security 

Full text available: pdf(217.88 KB) Additional Information: full citation, abstract, references, citings, index terms

Many network intrusion detection systems (NIDS) use byte sequences as signatures to 
detect malicious activity. While being highly efficient, they tend to suffer from a high
false-positive rate. We develop the concept of contextual signatures as an improvement of
string-based signature-matching. Rather than matching fixed strings in isolation, we augment the
matching process with additional context. When designing an efficient signature engine for 
the NIDS bro, we provide low-level context ... 

Keywords: bro, evaluation, network intrusion detection, pattern matching, security, 
signatures, snort 



14 Difficulties in simulating the internet 
Sally Floyd, Vern Paxson 

August 2001 IEEE/ACM Transactions on Networking (TON), volume 9 issue 4 

Full text available: pdf(111.73 KB) Additional Information: full citation, abstract, references, citings, index terms, review

Simulating how the global Internet behaves is an immensely challenging undertaking 
because of the network's great heterogeneity and rapid change. The heterogeneity ranges 
from the individual links that carry the network's traffic, to the protocols that interoperate 
over the links, the "mix" of different applications used at a site, and the levels of congestion
seen on different links. We discuss two key strategies for developing meaningful simulations 
in the face of these difficulties: searching ... 

Keywords: Internet, modeling, simulation 



15 Algorithms for identifying Boolean networks and related biological networks based on
matrix multiplication and fingerprint function

Tatsuya Akutsu, Satoru Miyano, Satoru Kuhara 

April 2000 Proceedings of the fourth annual international conference on 
Computational molecular biology 

Full text available: pdf(608.05 KB) Additional Information: full citation, abstract, references

Due to the recent progress of the DNA microarray technology, a large number of gene 
expression profile data are being produced. How to analyze gene expression data is an 
important topic in computational molecular biology. Several studies have been done using
the Boolean network as a model of a genetic network. This paper proposes efficient
algorithms for identifying Boolean networks of bounded indegree and related biological 
networks, where identification of a Boolean network can be formalized ... 

16 Watermarking algorithms: Exploiting self-similarities to defeat digital watermarking 
systems: a case study on still images 

Gwenael Doerr, Jean-Luc Dugelay, Lucas Grange 

September 2004 Proceedings of the 2004 multimedia and security workshop on 
Multimedia and security 

Full text available: pdf(1.27 MB) Additional Information: full citation, abstract, references, index terms

Unauthorized digital copying is a major concern for multi-media content providers. Since 
copyright owners lose control over content distribution as soon as data is decrypted or 
unscrambled, digital watermarking has been introduced as a complementary protection 
technology. In an effort to anticipate hostile behaviors of adversaries, the research 
community is constantly introducing novel attacks to benchmark watermarking systems. In 
this paper, a generic block replacement attack will be presented. ... 

Keywords: block replacement attack, intra-signal collusion, self-similarities 



17 Information retrieval session 4: general retrieval issues I: Content-based retrieval in 
hybrid peer-to-peer networks 
Jie Lu, Jamie Callan 

November 2003 Proceedings of the twelfth international conference on Information and 
knowledge management 

Full text available: pdf(262.41 KB) Additional Information: full citation, abstract, references, citings, index terms

Hybrid peer-to-peer architectures use special nodes to provide directory services for regions 
of the network ("regional directory services"). Hybrid peer-to-peer architectures are a 
potentially powerful model for developing large-scale networks of complex digital libraries, 
but peer-to-peer networks have so far tended to use very simple methods of resource 
selection and document retrieval. In this paper, we study the application of content-based 
resource selection and document retrieval to hybri ... 

Keywords: content-based, hybrid, peer-to-peer, retrieval, search 
18 Protocols: The Eigentrust algorithm for reputation management in P2P networks
Sepandar D. Kamvar, Mario T. Schlosser, Hector Garcia-Molina 

May 2003 Proceedings of the 12th international conference on World Wide Web 

Full text available: pdf(202.87 KB) Additional Information: full citation, abstract, references, citings, index terms

Peer-to-peer file-sharing networks are currently receiving much attention as a means of 
sharing and distributing information. However, as recent experience shows, the 
anonymous, open nature of these networks offers an almost ideal environment for the 
spread of self-replicating inauthentic files. We describe an algorithm to decrease the number 
of downloads of inauthentic files in a peer-to-peer file-sharing network that assigns each 
peer a unique global trust value, based on the peer's history of ... 

Keywords: distributed eigenvector computation, peer-to-peer, reputation 



19 XML schemas: integration and translation: A local search mechanism for peer-to-peer
networks

Vana Kalogeraki, Dimitrios Gunopulos, D. Zeinalipour-Yazti 

November 2002 Proceedings of the eleventh international conference on Information 
and knowledge management 

Full text available: pdf(238.97 KB) Additional Information: full citation, abstract, references, citings, index terms

One important problem in peer-to-peer (P2P) networks is searching and retrieving the 
correct information. However, existing searching mechanisms in pure peer-to-peer 
networks are inefficient due to the decentralized nature of such networks. We propose two 
mechanisms for information retrieval in pure peer-to-peer networks. The first, the modified 
Breadth-First Search (BFS) mechanism, is an extension of the current Gnutella protocol,
allows searching with keywords, and is designed to minimize the ... 

Keywords: distributed information retrieval, peer-to-peer networks

20 Distributed content-based visual information retrieval system on peer-to-peer networks
Irwin King, Cheuk Hang Ng, Ka Cheung Sia 

July 2004 ACM Transactions on Information Systems (TOIS), volume 22 issue 3 

Full text available: pdf(1.38 MB) Additional Information: full citation, abstract, references, citings, index terms

With the recent advances of distributed computing, the limitation of information retrieval 
from a centralized image collection can be removed by allowing distributed image data 
sources to interact with each other for data storage sharing and information retrieval. In 
this article, we present our design and implementation of DISCOVIR: Distributed
COntent-based Visual Information Retrieval system using the Peer-to-Peer (P2P) Network. We
describe the system architecture and detail the interactions ... 

Keywords: Peer-to-peer (P2P) network, content-based image retrieval (CBIR), information 
retrieval, intelligent query routing, peer clustering 